
    Online Discrimination of Nonlinear Dynamics with Switching Differential Equations

    How can one recognise whether an observed person is walking or running? We consider a dynamic environment where observations (e.g. the posture of a person) are caused by different dynamic processes (walking or running) which are active one at a time and which may transition from one to another at any time. For this setup, switching dynamic models have been suggested previously, mostly for linear and nonlinear dynamics in discrete time. Motivated by basic principles of computation in the brain (dynamic, internal models), we suggest a model for switching nonlinear differential equations. The switching process in the model is implemented by a Hopfield network, and we use parametric dynamic movement primitives to represent arbitrary rhythmic motions. The model generates observed dynamics by linearly interpolating the primitives weighted by the switching variables, and it is constructed such that standard filtering algorithms can be applied. In two experiments, with synthetic planar motion and a human motion capture data set, we show that inference with the unscented Kalman filter can successfully discriminate several dynamic processes online.
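The switching construction described in the abstract can be illustrated with a minimal sketch: two planar rhythmic primitives are linearly interpolated by switching variables, and a naive one-step prediction-error comparison stands in for the unscented Kalman filter. The two oscillators and all numerical values below are illustrative stand-ins, not the paper's actual movement primitives.

```python
import numpy as np

# Two rhythmic "primitives": planar oscillators at different angular speeds
# (stand-ins for, e.g., walking vs. running dynamics).
def f_slow(x):
    return np.array([-1.0 * x[1], 1.0 * x[0]])

def f_fast(x):
    return np.array([-3.0 * x[1], 3.0 * x[0]])

def mixed_dynamics(x, s):
    # Observed dynamics = linear interpolation of the primitives,
    # weighted by the switching variables s (s sums to one).
    return s[0] * f_slow(x) + s[1] * f_fast(x)

# Simulate observations from the "fast" process.
dt, T = 0.01, 500
x = np.array([1.0, 0.0])
traj = []
for _ in range(T):
    x = x + dt * mixed_dynamics(x, np.array([0.0, 1.0]))
    traj.append(x.copy())

# Naive online discrimination: accumulate one-step prediction errors of
# each primitive (a crude stand-in for unscented Kalman filtering).
err = np.zeros(2)
for x_prev, x_next in zip(traj[:-1], traj[1:]):
    for k, f in enumerate((f_slow, f_fast)):
        pred = x_prev + dt * f(x_prev)
        err[k] += np.sum((x_next - pred) ** 2)

best = int(np.argmin(err))  # index 1: the fast primitive explains the data
```

The same error-accumulation idea, applied recursively with proper uncertainty propagation, is what the filter-based inference in the paper does online.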

    Monolithic Integration of Silicon Nanowires With a Microgripper

    Si nanowire (NW) stacks are fabricated by utilizing the scalloping effect of inductively coupled plasma deep reactive ion etching. When two etch windows are brought close enough, scallops from both sides ideally meet along the dividing centerline of the windows, turning the separating material column into an array of vertically stacked strings. Upon further thinning of these NW precursors by oxidation followed by oxide etching, Si NWs with diameters ranging from 50 nm to above 100 nm are obtained. The pattern of NWs is determined solely by photolithography. Various geometries ranging from T-junctions to circular coils are demonstrated in addition to straight NWs along specific crystallographic orientations. The number of NWs in a stack is determined by the number of etch cycles utilized. Due to the precise lithographic definition of NW location and orientation, the technique provides a convenient batch-compatible tool for the integration of NWs with MEMS. This aspect is demonstrated with a microgripper, where an electrostatic actuation mechanism is fabricated simultaneously with the accompanying NW end-effectors. The mechanical integrity of the NW–MEMS bond and the manipulation capability of the gripper are demonstrated. Overall, the proposed technique offers a batch-compatible approach to the issue of micro-nano integration.

    Monolithic integration of Si nanowires with metallic electrodes: NEMS resonator and switch applications

    The challenge of wafer-scale integration of silicon nanowires into microsystems is addressed by developing a fabrication approach that combines Bosch-process-based nanowire fabrication with surface micromachining and chemical-mechanical-polishing-based metal electrode/contact formation. Nanowires up to 50 μm in length are achieved while retaining submicron nanowire-to-electrode gaps. The scalability of the technique is demonstrated by using no patterning method other than optical lithography on conventional SOI substrates. The structural integrity of double-clamped nanowires is evaluated through a three-point bending test, where good clamping quality and fracture strengths approaching the theoretical strength of the material are observed. The resulting devices are characterized in resonator and switch applications, two areas of interest for CMOS-compatible solutions, with all-electrical actuation and readout schemes. Improvements and tuning of the obtained performance parameters, such as resonance frequency, quality factor and pull-in voltage, are simply a question of conventional design and process adjustments. Implications of the proposed technique are far-reaching, including system-level integration of either single-nanowire devices within thick Si layers or nanowire arrays perpendicular to the plane of the substrate.
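To give a feel for the resonance frequencies such devices operate at, the first flexural mode of a doubly clamped nanowire can be estimated from standard Euler-Bernoulli beam theory. The dimensions and material constants below are illustrative assumptions (a 100 nm diameter, 10 μm long Si wire), not values reported in the paper.

```python
import math

# Back-of-the-envelope estimate of the first flexural resonance of a
# doubly clamped Si nanowire modelled as an Euler-Bernoulli beam with a
# circular cross-section. All dimensions and constants are illustrative.
E = 169e9      # Young's modulus of Si along <110>, Pa
rho = 2330.0   # density of Si, kg/m^3
d = 100e-9     # nanowire diameter, m
L = 10e-6      # nanowire length, m

I = math.pi * d**4 / 64   # second moment of area, circular section
A = math.pi * d**2 / 4    # cross-sectional area
lam2 = 4.730**2           # first-mode eigenvalue for clamped-clamped ends

f1 = (lam2 / (2 * math.pi * L**2)) * math.sqrt(E * I / (rho * A))
print(f"{f1 / 1e6:.1f} MHz")  # roughly 7.6 MHz for these dimensions
```

As the abstract notes, such parameters scale with geometry (here as d/L²), so tuning them is indeed a matter of conventional design adjustments.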

    From Birdsong to Human Speech Recognition: Bayesian Inference on a Hierarchy of Nonlinear Dynamical Systems

    <div><p>Our knowledge about the computational mechanisms underlying human learning and recognition of sound sequences, especially speech, is still very limited. One difficulty in deciphering the exact means by which humans recognize speech is that there are scarce experimental findings at a neuronal, microscopic level. Here, we show that our neuronal-computational understanding of speech learning and recognition may be vastly improved by looking at an animal model, i.e., the songbird, which faces the same challenge as humans: to learn and decode complex auditory input, in an online fashion. Motivated by striking similarities between the human and songbird neural recognition systems at the macroscopic level, we assumed that the human brain uses the same computational principles at a microscopic level and translated a birdsong model into a novel human sound learning and recognition model with an emphasis on speech. We show that the resulting Bayesian model with a hierarchy of nonlinear dynamical systems can learn speech samples such as words rapidly and recognize them robustly, even in adverse conditions. In addition, we show that recognition can be performed even when words are spoken by different speakers and with different accents—an everyday situation in which current state-of-the-art speech recognition models often fail. The model can also be used to qualitatively explain behavioral data on human speech learning and derive predictions for future experiments.</p></div>
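The core recognition principle, inverting a generative model by minimizing precision-weighted prediction error, can be sketched in a deliberately stripped-down form. The mapping g, the precision values and the simple gradient scheme below are illustrative stand-ins for the paper's hierarchy of nonlinear dynamical systems and its Dynamic Expectation Maximization inversion.

```python
import numpy as np

# Minimal sketch: recover a hidden cause v from a noisy observation by
# gradient descent on precision-weighted prediction errors.
rng = np.random.default_rng(0)

def g(v):
    # Toy generative mapping: hidden cause v -> predicted observation.
    return np.array([np.sin(v), np.cos(v), v])

true_v = 0.8
y = g(true_v) + 0.01 * rng.standard_normal(3)  # noisy sensory input

pi_sens, pi_prior = 100.0, 1.0  # precisions (inverse variances)
v_prior, v, lr = 0.0, 0.0, 1e-3
for _ in range(5000):
    eps_sens = y - g(v)                           # sensory prediction error
    eps_prior = v - v_prior                       # prior prediction error
    dg = np.array([np.cos(v), -np.sin(v), 1.0])   # dg/dv
    # Descend the precision-weighted free energy:
    v += lr * (pi_sens * dg @ eps_sens - pi_prior * eps_prior)
# v ends near the true cause (about 0.8)
```

The precision ratio pi_sens/pi_prior controls how much the estimate trusts the input versus its prior, the same trade-off the paper exploits for robust recognition under noise.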

    Performance of the recognition model in “cocktail party” situations.

    <p>A module is trained on an auditory sentence (“She argues with her sister”) without competing speakers and tested for recognition of this sentence in three conditions: <b>Left column)</b> No competing speaker, <b>Middle column)</b> one competing speaker, and <b>Right column)</b> three competing speakers. Each column shows the second level dynamics, first level dynamics and cochleagram with arbitrary units in neuronal activation. Second level dynamics were successfully reconstructed for the single speaker and also, to an extent, for the speech sample with one competing speaker. In the case of three competing speakers, the module was not able to reconstruct the second level dynamics completely, but showed some signs of recovery at the beginning and at the end of the sentence. Note that the increasing difficulty in reconstruction of the speech message from one to three speakers is not reflected in the prediction errors at the first level (dashed lines), but becomes obvious at the second level.</p>

    Schema of ideal precision settings, at the first and second levels of a module, for learning and recognition under noise.

    <p>The precision of a population at each level is indicated by the line thickness around the symbols, and the influence of a population over another is indicated by arrow strength. <b>A</b>) During learning, the precision ratio at the first level (precision of the sensory states, i.e., causal states, over precision of the internal (hidden) dynamics) should be high. Consequently, the internal dynamics at the first level are dominated by the dynamics of the sensory input. At the second level, a very high precision makes sure that the module is forced to explain the sensory input as sequential dynamics by updating (learning) the connections between first and second levels (the <i>I</i>'s in the first line of <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003219#pcbi.1003219.e043" target="_blank">Equation 2</a>). <b>B</b>) Under noisy conditions, the sensory input is not reliable and recognition performance is best if the precision at the sensory level is low compared to the precision of the internal dynamics at both levels (low sensory/internal precision ratio). This allows the module to rely on its (previously learned) internal dynamics, but less so on the noisy sensory input. For the exact values of the precision settings in each scenario, see <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003219#pcbi.1003219.s001" target="_blank">Text S1</a>.</p>

    Generated neuronal network activity at the first level after learning.

    <p>The solid lines represent the cochleagram dynamics obtained from the stimulus (the word “zero”, the same stimulus as shown in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003219#pcbi-1003219-g001" target="_blank">Figure 1</a>) that the module had to learn. Neuronal activity was normalized to one. The dashed lines represent the neuronal activity generated by the module after learning and show that the module has successfully learned the proper <i>I</i> vectors between the two levels.</p>

    Word Error Rates (WER) for isolated digit recognition task reported in the literature for different recognition methods.

    <p>Note: DEM (Dynamic Expectation Maximization) is the recognition system used in this paper; LSTM (Long Short-Term Memory) network was introduced in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003219#pcbi.1003219-Graves1" target="_blank">[111]</a>, LSM (Liquid State Machine) with 1232 neurons was reported in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003219#pcbi.1003219-Verstraeten2" target="_blank">[51]</a> and was improved (LSM 2) in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003219#pcbi.1003219-Verstraeten1" target="_blank">[18]</a>. The results for the state-of-the-art speech recognition system using HMM (Hidden Markov Model) were reported in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003219#pcbi.1003219-Verstraeten1" target="_blank">[18]</a>. OT (Occurrence Time) features were used in <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003219#pcbi.1003219-Zavaglia1" target="_blank">[103]</a>.</p>

    Accent adaptation of the recognition model.

    <p><b>A</b>) The cochleagrams represent two utterances of “eight”. A module originally learned the word “eight” spoken with a British (North England) accent (top) and then recognized an “eight” spoken with a New Zealand accent (bottom). <b>B</b>) The module trained on the British accent was allowed to adapt to the New Zealand accent under different precision settings for the first-level sensory (causal) and internal (hidden) states (sensory and internal log-precision values varying from left to right). For each precision ratio, we plotted the reduction in prediction error (of the causal states, see <a href="http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.1003219#s2" target="_blank">Model</a>) after five repetitions of the word “eight” spoken with a New Zealand accent. As expected, accent adaptation was accomplished only with high sensory/internal precision ratios (resulting in greatly reduced prediction errors), whereas no adaptation occurred (prediction errors remained high) when this ratio was low.</p>